Abstract. Cloud computing shares a common pool of resources across customers at a scale
that is orders of magnitude larger than traditional multi-user systems. Constituent physical
compute servers are allocated multiple "virtual machines" (VMs) to serve simultaneously.
Each VM user should ideally be unaffected by others' demand. Naturally, this environment 
produces new challenges for the service providers in meeting customer expectations while 
extracting an efficient utilization from server resources. We study a new cloud service metric 
that measures prolonged latency or delay suffered by customers. We model the workload 
process of a cloud server and analyze the process as the customer population grows. The 
capacity required to ensure that average workload does not exceed a threshold over long 
segments is characterized. This can be used by cloud operators to provide service guarantees 
on avoiding long durations of latency. As part of the analysis, we provide a uniform large
deviation principle for collections of random variables that is of independent interest.

1. Introduction

Cloud computing is a paradigm shift of multiple orders of magnitude in the pursuit of 
extracting greater utilization of server resources while serving the computing needs of a large 
collection of customers. This has been made possible primarily by the concept of workload 
virtualization wherein individual users operate on virtual machines (VMs), each with modest 
resource requirements, and multiple VMs are served by a single large computing server. 
Cloud service providers achieve greater utilization by over-provisioning VMs on compute 
nodes, acting on the assumption that rarely will multiple customers simultaneously require 
large quantities of resources. 

The resource requirement of a user over time is a stochastic process, modeled here as a
discrete-time moving-average (MA) process. We allow for a heterogeneous population of
customers, who are partitioned only by their statistical behaviour and are otherwise
considered equal in terms of priority of service. Service guarantees currently provided by
cloud computing providers (Amazon Web Services' EC2, Google's Web Toolkit, Microsoft's
Azure, etc.) are weak: Service Level Agreements (SLAs) are available only for quick initial
provisioning of a new VM from a user onto a compute node, but no guarantees are provided
on the quality of service experienced by the customer over time. Large organizations with
significant computing requirements, who are willing to pay for good service guarantees, are
thus wary of using this architecture for any activity beyond their non-critical desktop usage;
see Li et al. (2009) and Mend~le7 (2010). This in particular impedes large-scale adoption of
cloud computing for time-critical and resource-intensive workloads.

New techniques need to be developed to address the challenge of estimating performance 
from the user's perspective in this computing paradigm. A key performance indicator in 
multi-user systems measures the latency suffered by users. Latency occurs when access to
computing resources is throttled because the total quantity of one or more resources required
(CPU cycles, memory space, I/O bandwidth, etc.) by all the VMs exceeds the server's capacity.
Then, under the most commonly used form of processor sharing discipline, all customers on
the server are provisioned proportionately lower resources than they had requested and thus
are said to experience latency. Suppose the server is allocated a capacity that maintains a
steady per-customer average $C_p$ above its expected value. Even if $C_p$ is a large number, there
will be time segments during which the average workload of the server will exceed the total
capacity. Applications that are intolerant to latency are discouraged from being put on clouds
in the absence of Service Level Agreements that penalize their incidence (Li et al. (2009)).



Therefore, for a company that wishes to guarantee its customers availability of the server's 
resources, it is important to understand how large and frequent such long time segments of 
continued latency can be. We provide a framework to construct such estimates. In particular, 
we use this framework to estimate the time till the first observation of continued latencies of a 
given large time length, and its dual, the largest period of latency experienced within a given 
time. Cloud service operators can utilize this technique to create SLA contracts. In addition, 
the relationship between the expected first observation time and the per-customer average 
capacity can help design system improvements to minimize SLA violations. An operator 
may also provide differentiated service to customers, where those willing to pay for better 
guarantees can be put on an isolated sub-cloud with capacity provisioning tailored to their 
growth, usage and the agreed upon SLA contract. 

Our framework is built on analyzing long strange segments (see definitions (2.2) and
(2.3)) of the underlying workload process of the cloud server; refer to Arratia et al. (1990)
and Ghosh and Samorodnitsky (2010) for a review. A standard technique for analyzing the
rate of growth of long strange segments for stationary processes involves an associated large
deviation principle (see discussion at the end of Section 2). While standard probabilistic
models (for example, queues) operate on stationary processes, the cloud workload process
is non-stationary (see definition in Section 2). This is because the total number of virtual
machines in the cloud environment increases over time. This is a consequence of the fact 
that VMs are software artifacts that are inexpensive to instantiate and operate, and so client 
organizations tend to encourage large-scale adoption and persistent usage of the VMs within 
their organization. In addition, a major new technological innovation allows fast migration 
of VMs between individual physical servers within the same cloud infrastructure. Thus, the 
cloud service environment is better modeled as consisting of larger logical servers that each
continually grow in capacity in order to serve a continually growing population of users, which
yields a non-stationary workload process. 

The standard large deviation tools that are vital to the analysis of long strange segments 
of stationary processes are thus not useful for our non-stationary workload process. This 
process however has a certain structure that can be gainfully exploited. To take advantage of 
this, we develop a tool for proving a uniform large deviation principle that in its most general
form applies to collections of random variables that satisfy certain regularity conditions
(see Theorem 3.1 in Section 3). This tool, which is of independent interest, plays a crucial
role in proving Theorem 2.2, the main result of this paper, which provides a strong law
characterization of the rate of growth of the duration of latency periods as a function of the
Fenchel-Legendre transform of the log moment generating function of the underlying process.
The conditions imposed by the uniform large deviation principle (Theorem 3.3) admit many
common models for computer workloads.

To summarize, the main contributions of this paper are: 

a) We provide a tool for proving a uniform large deviation principle for a collection of sequences
of probability measures. Recall that the Gärtner-Ellis Theorem is a very helpful device
for proving a large deviation principle for a single sequence of probability measures; refer to
Gärtner (1977), Ellis (1984) and Dembo and Zeitouni (1998, Theorem 2.3.6, p. 44). We
view Theorem 3.1 as an analogue of the Gärtner-Ellis Theorem for proving a uniform large
deviation principle for a collection of such sequences. The conditions imposed on the
random variables restrict the set of admissible probability laws, but are sufficiently flexible
to apply to a wide variety of situations.

b) We provide strong laws characterizing the rate of growth of two performance measures 
of service under the cloud computing architecture, namely the minimum time taken to 
observe a continued latency period of a given length, and its dual the maximum latency 
period that is observed within a given time. 

c) We show, using a motivating example, how these results can be used by a cloud service
manager to (i) create SLA contracts representing a guarantee to the customer against
chances of observing frequent long latencies, and (ii) design system improvements to
minimize the frequency of long latencies, such as rates at which new capacity should be
procured/allocated to maintain or improve service.

The following section describes our model of the cloud environment and states the main 
result of this paper. We conclude the section with a discussion of a representative example. 
Section 3 states and proves the uniform large deviation principle for collections of random
variables. This is used in Section 4 where the main result is proved.



2. Cloud Model and Main Result 

We model the workload of each user with respect to the instantaneous requirements for a
single resource, e.g. CPU cycles required, over time. A total of $K$ customer groups are served,
where groups differ in their workload characterization. The cloud is managed in a manner
that provisions $n_i(t)$ customers from the $i$th group at time $t$ on each large logical server. The
function $n_i(t)$ is assumed to be a power function, i.e. there exist a positive constant $\alpha$ and
positive integers $c_1, \ldots, c_K$ such that

$$n_i(t) = c_i \lfloor t^{\alpha} \rfloor \quad \text{for all } i = 1, \ldots, K.$$

For any $x \in \mathbb{R}$, $\lfloor x \rfloor$ denotes the greatest integer less than or equal to $x$ and $\lceil x \rceil$ represents the
smallest integer greater than or equal to $x$. The $c_i$ are chosen to be positive integers rather
than real numbers. This is solely because of convenience in handling the limit identities
which appear below; we are certain that taking $n_i(t) = \lfloor c_i t^{\alpha} \rfloor$ for some positive real number
$c_i$ would not have any significant effect on the results. This form for $n_i(t)$ has two important
implications: first, the relative mix of customers from each group, defined by the ratios of the
parameters $c_i$, remains constant over time, and only the total population of users grows with
time. Second, the number of customers remains a deterministic function of time. We believe
this setting can be easily generalized to allow the number of customers to be a stochastic
process, e.g. the case where $(n_1(t), \ldots, n_K(t))$ are jointly regularly varying with index $\alpha$ and
the number of customers in the $i$th group is a Poisson process with intensity $n_i(t)$, but we do
not foresee this situation adding any extra insights to the studied problem.
The $j$th customer in the $i$th group has workload $W_{i,j}(t)$ at discrete time $t$:

$$W_{i,j}(t) = \mu_i + X_{i,j}(t) = \mu_i + \beta_i^T Z(t) + \varepsilon_{i,j}(t) \quad \text{for all } 1 \le i \le K,\ 1 \le j \le n_i(t),\ t \ge 1,$$

where $\mu_i$ is a constant denoting the expected workload of customers in the $i$th group and
$X_{i,j}(t)$ is the deviation from the mean workload of the $j$th customer in the $i$th group at time
$t$. The stochastic process $X_{i,j}(t)$ is further defined as the weighted sum of a $K$-dimensional
moving-average process $Z(t)$ and an additional pure-noise term $\varepsilon_{i,j}(t)$. The
weights $\beta_i \in \mathbb{R}^K$ are group-specific constants. The noise process $(\varepsilon_{i,j}(t);\ 1 \le i \le K,\ t \ge 1,\ 1 \le j \le n_i(t))$
consists of independent and identically distributed (i.i.d.) random variables,
independent of $(\xi(t), t \in \mathbb{Z})$, with mean zero, satisfying

$$\Lambda_\varepsilon(\lambda) := \log E[\exp\{\lambda \varepsilon_{i,j}(t)\}] < \infty \quad \text{in a neighborhood of } 0.$$

The process $Z(t)$ is a $K$-dimensional moving average process defined as

$$Z(t) = \sum_{k \in \mathbb{Z}} \phi_k\, \xi(t - k) \quad \text{for all } t \in \mathbb{Z},$$

with $\sum_k |\phi_k| < \infty$. We will assume $\phi := \sum_k \phi_k \neq 0$. The innovations $(\xi(t), t \in \mathbb{Z})$ are
$K$-dimensional i.i.d. random variables with mean zero, satisfying

$$\Lambda_\xi(\eta) := \log E[\exp\{\eta \cdot \xi(t)\}] < \infty \quad \text{for all } \eta \in \mathbb{R}^K, \tag{2.1}$$

where for any two vectors $x$ and $y$, $x \cdot y$ denotes the scalar product. We shall place the following
additional restriction on the log-m.g.f. $\Lambda_\xi(\cdot)$ to satisfy the conditions of the uniform large
deviation principle (Theorem 3.3):

Assumption 2.1. $\left| \frac{d}{d\lambda} \Lambda_\xi(\lambda\beta) \right| \to \infty$ whenever $|\lambda| \to \infty$, where $\beta := C^{-1} \left( \sum_{i=1}^{K} c_i \beta_i \right)$ with
$C := \sum_{i=1}^{K} c_i$.

This mild restriction on the parameters of the MA process is satisfied by realistic computing
workloads. For example, it admits a Gaussian form for the innovations $\xi$.

The expected workload of the server at time $t$ is given by $\sum_{i=1}^{K} n_i(t)\mu_i$. In our setup the
number of customers in each group grows over time and so does the expected workload of
the server. Hence, to keep the system solvent and avoid build up of an infinite queue, the
capacity of the server must also be continually increased. This can be done, for example, by
ensuring that the capacity grows in order to maintain a constant ratio of $C_p$ with the total
expected workload. Our imperative is to understand the deviations from the mean workload.
Define $S(t)$ as the sum of all the deviations until time $t$:

$$S(t) := \sum_{k=1}^{t} \sum_{i=1}^{K} \sum_{j=1}^{n_i(k)} X_{i,j}(k) \quad \text{for all } t \ge 1,$$

and $N(t)$ as the associated normalizing term for time $t$:

$$N(t) := \sum_{k=1}^{t} \sum_{i=1}^{K} n_i(k) \quad \text{for all } t \ge 1.$$

By convention, we understand that $\sum_{l=i}^{j} x_l = 0$ if $j < i$. Furthermore, if $i$ and $j$ are not
integers, $\sum_{l=i}^{j} x_l$ will denote $\sum_{l=\lceil i \rceil}^{\lfloor j \rfloor} x_l$.

We study the average deviation of the workload of the server from its mean over long
segments of time. For any time segment $(k, l)$ the average deviation is given by

$$X(k, l) := \frac{S(l) - S(k)}{N(l) - N(k)}.$$

A simple argument using the law of large numbers tells us that $X(k, l)$ should not be too far away
from 0 if $l - k$ is large. If $X(k, l)$ is not close to 0 then we term $(k, l)$ a strange segment. It
is also easy to see that if we fix any number $L$ and a threshold $\epsilon$ and wait sufficiently long,
we will almost surely get a segment $(k, l)$ such that $l - k > L$ and $X(k, l) > \epsilon$. Our main
result describes how the length of these strange segments grows over time.
For any measurable set $A$, we define the long strange segments as

$$R_t(A) := \sup\{m : X(l - m, l) \in A \text{ for some } l = m, \ldots, t\}, \tag{2.2}$$

and its dual characteristic

$$T_r(A) := \inf\{l : \text{there exists } k,\ 0 \le k \le l - r, \text{ such that } X(k, l) \in A\}. \tag{2.3}$$

The functional $R_n(A)$ is the maximum length of a segment from the first $n$ observations
whose average is in the set $A$. $T_n(A)$ is the minimum number of observations required to have a
segment of length at least $n$ whose average is in the set $A$. It is easy to see that $R_t(A)$ grows
as $t \to \infty$ and $T_r(A)$ grows as $r \to \infty$. Theorem 2.2 below describes the rate of growth of
these functionals. There is a duality relation between the rate of growth of these functionals
which follows from the fact $\{T_r(A) \le m\} = \{R_m(A) \ge r\}$. If the per-customer capacity of
the server is maintained at $C_p$ units above its expected value then we will take $A = (C_p, \infty)$.
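
As an illustration of definitions (2.2) and (2.3), the following Python sketch evaluates $X(k, l)$, $R_t(A)$ and $T_r(A)$ by brute force for a set of the form $A = (c, \infty)$. It is a minimal sketch and not part of the original analysis: the function names, the toy input arrays and the finite search horizon used for $T_r(A)$ are assumptions made for the example.

```python
from itertools import accumulate


def prefix_sums(values):
    """Return [0, v_1, v_1 + v_2, ...] so that segment sums are differences."""
    return [0.0] + list(accumulate(values))


def segment_average(S, N, k, l):
    """X(k, l) = (S(l) - S(k)) / (N(l) - N(k)) for integers 0 <= k < l."""
    return (S[l] - S[k]) / (N[l] - N[k])


def longest_strange_segment(S, N, t, threshold):
    """R_t(A) for A = (threshold, inf); returns 0 when no segment qualifies."""
    best = 0
    for l in range(1, t + 1):
        # Only segment lengths larger than the current best can improve it.
        for m in range(best + 1, l + 1):
            if segment_average(S, N, l - m, l) > threshold:
                best = m
    return best


def first_strange_time(S, N, r, threshold, horizon):
    """T_r(A) for A = (threshold, inf), searched up to `horizon` (r >= 1).

    Returns None when no qualifying segment ends by `horizon`.
    """
    for l in range(r, horizon + 1):
        for k in range(0, l - r + 1):
            if segment_average(S, N, k, l) > threshold:
                return l
    return None


# Toy data (hypothetical): total deviation and customer count at times 1..5.
deviation_totals = [0.5, -1.0, 3.0, 2.0, -6.0]
customer_counts = [1, 1, 2, 2, 3]
S = prefix_sums(deviation_totals)
N = prefix_sums(customer_counts)

R = [longest_strange_segment(S, N, t, 0.4) for t in range(1, 6)]
T = [first_strange_time(S, N, r, 0.4, 5) for r in range(1, 6)]
print(R)  # [1, 1, 3, 4, 4]
print(T)  # [1, 3, 3, 4, None]
```

On this toy input the duality $\{T_r(A) \le m\} = \{R_m(A) \ge r\}$ can be checked entry by entry, for instance $T_2 = 3$ while $R_2 = 1 < 2 \le R_3 = 3$. The brute-force search is cubic in the horizon and is meant only for small illustrations.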
For any convex function $f(\cdot)$, we will use $f^*(\cdot)$ to denote its Fenchel-Legendre transform:

$$f^*(x) := \sup_{\lambda \in \mathbb{R}} \{\lambda x - f(\lambda)\}.$$
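
The transform is rarely available in closed form, so a numerical evaluation can be useful. The sketch below approximates $f^*(x)$ by ternary search, using the fact that $\lambda \mapsto \lambda x - f(\lambda)$ is concave when $f$ is convex. The bracket $[-50, 50]$, the iteration count and the two test functions (a Gaussian log-m.g.f. with variance 2 and the log-m.g.f. of a centered Poisson(1) variable) are assumptions chosen for illustration.

```python
import math


def fenchel_legendre(f, x, lam_lo=-50.0, lam_hi=50.0, iterations=200):
    """Approximate f*(x) = sup_lam {lam * x - f(lam)} for a convex f.

    The objective lam -> lam * x - f(lam) is concave, so ternary search on
    [lam_lo, lam_hi] converges to its maximiser, provided the maximiser
    lies inside the bracket (an assumption of this sketch).
    """
    def objective(lam):
        return lam * x - f(lam)

    lo, hi = lam_lo, lam_hi
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            lo = m1
        else:
            hi = m2
    return objective(0.5 * (lo + hi))


def gaussian_cgf(lam, variance=2.0):
    """Log-m.g.f. of a centered Gaussian: variance * lam^2 / 2."""
    return 0.5 * variance * lam * lam


def centered_poisson_cgf(lam):
    """Log-m.g.f. of (Poisson(1) - 1): exp(lam) - 1 - lam."""
    return math.exp(lam) - 1.0 - lam


# Closed forms for comparison: x^2 / (2 * variance) and (1 + x) log(1 + x) - x.
print(fenchel_legendre(gaussian_cgf, 3.0), 3.0 ** 2 / 4.0)
print(fenchel_legendre(centered_poisson_cgf, 1.0), 2.0 * math.log(2.0) - 1.0)
```

Both printed pairs should agree to many decimal places. For functions whose maximiser may lie outside the bracket, the bracket has to be widened by hand.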



For any set $A \subset \mathbb{R}$, $A^\circ$ and $\bar{A}$ will represent the interior and closure of $A$ respectively.
Theorem 2.2. For any measurable set $A$,

$$I_* \le \liminf_{r \to \infty} \frac{\log T_r(A)}{r} \le \limsup_{r \to \infty} \frac{\log T_r(A)}{r} \le I^* \quad \text{a.s.,} \tag{2.4}$$

and

$$\frac{1}{I^*} \le \liminf_{t \to \infty} \frac{R_t(A)}{\log t} \le \limsup_{t \to \infty} \frac{R_t(A)}{\log t} \le \frac{1}{I_*} \quad \text{a.s.,} \tag{2.5}$$

where

$$I_* = \inf_{x \in \bar{A}} \Lambda^*(x) \quad \text{and} \quad I^* = \inf_{x \in A^\circ} \Lambda^*(x),$$

and $\Lambda^*(x)$ is the Fenchel-Legendre transform of $\Lambda(\lambda) := \Lambda_\xi(\lambda\phi\beta)$.

Remark 2.3. Under our assumption that the customer group compositions remain constant,
the customer groups are all jointly represented by their average $\beta = \left( \sum_{i=1}^{K} c_i \beta_i \right) / \sum_{i=1}^{K} c_i$.

Remark 2.4. We are interested in sets of the form $A = (C_p, \infty)$, where system stability
requires that the value $C_p$ be set greater than 0. Then, the continuity and increasing nature
of the Fenchel-Legendre transform over $A$ ensures that the infima over the sets $\bar{A}$ and $A^\circ$
are achieved at $C_p$. Thus, the upper and lower bounds in (2.4) and (2.5) collapse to give a
limit result of the form:

$$\lim_{r \to \infty} \frac{\log T_r(A)}{r} = \lim_{t \to \infty} \frac{\log t}{R_t(A)} = \Lambda^*(C_p) \quad \text{a.s.} \tag{2.6}$$

Example 2.5. Suppose that the innovation vectors $\xi(t)$ are i.i.d. replicates of a $K$-dimensional
joint-normal random vector with mean zero and covariance matrix $\Sigma$. In that case
$\Lambda(\lambda) = \lambda^2 \phi^2 \beta^T \Sigma \beta / 2$ and hence

$$\Lambda^*(x) = (2\phi^2 \beta^T \Sigma \beta)^{-1} x^2 \quad \text{for all } x \in \mathbb{R}.$$

Therefore, if $A = (C_p, \infty)$ then

$$\lim_{r \to \infty} \frac{\log T_r(A)}{r} = \lim_{t \to \infty} \frac{\log t}{R_t(A)} = (2\phi^2 \beta^T \Sigma \beta)^{-1} C_p^2 \quad \text{a.s.} \tag{2.7}$$

This yields the estimates $T_r \sim \exp\{r C_p^2 / M\}$ and $R_t \sim M \log t / C_p^2$, where $C_p$ represents the
server's capacity and $M = 2\phi^2 \beta^T \Sigma \beta$ is a property of the customer classes. As expected,
higher values of $C_p$ increase the rate of growth of the time $T_r$ that elapses before observing a latency
period of length $r$. On the other hand, higher variability of the innovation $\xi(t)$ or a higher
value of $|\phi|$ in the MA process results in a higher value of $M$ and culminates in a faster
growth of the long latency periods $R_t$ observed in time $t$.

Another interesting application is when $A = (-\infty, -C_p)$. This can be used to check if
there are long time periods when the server resources are being severely underutilized. By
the symmetry of the Gaussian distribution, the estimates for $T_r$ and $R_t$ remain the same
in this case. In particular, if $C_p$ were chosen equal to the average workload size, then $R_t$
estimates the longest period by time $t$ when the server idles.
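
The Gaussian estimates of this example are simple enough to turn into a small capacity-planning calculation. The sketch below is only a leading-order illustration of $T_r \sim \exp\{r C_p^2 / M\}$ and $R_t \sim M \log t / C_p^2$ with $M = 2\phi^2 \beta^T \Sigma \beta$; the function names and the numerical inputs are hypothetical, and the asymptotic statements say nothing about constants or lower-order terms.

```python
import math


def workload_constant(phi, beta, sigma):
    """M = 2 * phi^2 * beta^T Sigma beta, with Sigma given as a list of rows."""
    size = len(beta)
    quad = sum(beta[i] * sigma[i][j] * beta[j]
               for i in range(size) for j in range(size))
    return 2.0 * phi ** 2 * quad


def rate(c_p, M):
    """Limit appearing in (2.7): C_p^2 / M."""
    return c_p ** 2 / M


def log_time_to_latency(r, c_p, M):
    """Leading-order estimate of log T_r, namely r * C_p^2 / M."""
    return r * rate(c_p, M)


def longest_latency(t, c_p, M):
    """Leading-order estimate of R_t, namely M * log(t) / C_p^2."""
    return math.log(t) / rate(c_p, M)


def capacity_for_target(r, log_horizon, M):
    """Value of C_p at which the estimate of log T_r equals log_horizon."""
    return math.sqrt(M * log_horizon / r)


# Hypothetical inputs: two customer groups, correlated unit-variance innovations.
M = workload_constant(1.0, [1.0, 1.0], [[1.0, 0.5], [0.5, 1.0]])
print(M)                                   # 6.0
print(log_time_to_latency(4, 3.0, M))      # 6.0
print(capacity_for_target(4, 6.0, M))      # 3.0
```

Read as an SLA aid, `capacity_for_target` answers the question of how large $C_p$ should be so that, to leading order, a latency period of length $r$ is not expected before time $e^{\texttt{log\_horizon}}$.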

We postpone the proof of Theorem 2.2 till Section 4 and develop the proper tools required
for the proof in Section 3. We close this section with a discussion on why standard large
deviation tools are inadequate for the proof of Theorem 2.2.



The rate of growth of long strange segments has been studied by Mansfield et al. (2001) for
moving average processes with heavy-tailed innovations and then by Rachev and Samorodnitsky
(2001) for long-range dependent moving average processes with heavy-tailed innovations.
Recently Ghosh and Samorodnitsky (2010) studied the effect of memory on the rate of growth
of long strange segments for a moving average process with light-tailed innovations. A
strong law of the form (2.5) is often referred to as the Erdős-Rényi law of large numbers;
Erdős and Rényi (1970) proved asymptotics for longest head runs in i.i.d. coin tosses.
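
The coin-tossing case gives a quick way to see this kind of logarithmic growth numerically: for a fair coin the longest run of heads among the first $n$ tosses is known to grow like $\log_2 n$. The following sketch simulates it; the helper names, the seed and the sample size are assumptions of the example, and a single simulated path only illustrates the order of magnitude.

```python
import math
import random


def longest_head_run(tosses):
    """Length of the longest run of consecutive truthy entries (heads)."""
    best = current = 0
    for toss in tosses:
        current = current + 1 if toss else 0
        best = max(best, current)
    return best


def simulate_longest_run(n, seed=0):
    """Longest head run in n simulated fair coin tosses."""
    rng = random.Random(seed)
    return longest_head_run(rng.random() < 0.5 for _ in range(n))


n = 1 << 16
# Compare the simulated longest run with log2(n) = 16.
print(simulate_longest_run(n), math.log2(n))
```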



It is instructive to take a heuristic look at the standard technique of proving the rate of
growth of long strange segments for a stationary process, say $(Y_t)$. A vital tool for analyzing
this growth is a large deviation principle associated with the partial sums of $(Y_t)$. Recall that
a sequence of probability measures $(P_t, t \ge 1)$ satisfies the large deviation principle (LDP) on $\mathbb{R}$ if
there exists a non-negative lower-semicontinuous function $I(\cdot)$ such that for any measurable
$A \subset \mathbb{R}$

$$-\inf_{x \in A^\circ} I(x) \le \liminf_{t \to \infty} \frac{1}{t} \log P_t(A) \le \limsup_{t \to \infty} \frac{1}{t} \log P_t(A) \le -\inf_{x \in \bar{A}} I(x). \tag{2.8}$$

The function $I(\cdot)$ is called the rate function. A rate function with compact level sets is called
a good rate function.
Denote the average of the segment $(k, l)$ by

$$Y(k, l) := \frac{\sum_{i=k+1}^{l} Y_i}{l - k}.$$



It is often possible to show that the law of $Y(0, t)$ satisfies an LDP under assumptions
of mixing or other specific structure on $(Y_t)$ and existence of exponential moments of $Y_0$;
see for example Bryc and Dembo (1996), Varadhan (1984), Dembo and Zeitouni (1998) and
Deuschel and Stroock (1989). Then for a 'nice' set $A$ such that $E(Y_0) \notin A$ there exists
$I > 0$ such that for $t$ large

$$\log P[Y(0, t) \in A] \sim -It.$$

Using stationarity, this implies $\log P[Y(l, l + t) \in A] \sim -It$ for every $l > 0$. Heuristically,
this means that among approximately $e^{It}$ segments of length $t$, we can expect to find one whose
average is in $A$. The segments $(0, t), (1, t + 1), (2, t + 2), \ldots$ are not independent but
that is typically handled using mixing type conditions borrowed from the process $(Y_t)$ itself.
Theorem 2.3 in Ghosh and Samorodnitsky (2010) is an example of this line of argument
where the authors consider moving average processes and use the large deviation principle
for partial sums proved in Ghosh and Samorodnitsky (2009) to obtain asymptotic results for
the rate of growth of long strange segments.

In our application's setting, the distribution of $X(l, l + t)$ differs from that of $X(0, t)$ when
$l > 0$. This is because the growing number of customers in the system implies that each
$X(l, l + t)$ represents an average over a different number of realizations ($N(t + l) - N(l)$ versus
$N(t)$). So, in order to understand the rate of growth of the long strange segments we need
to estimate the probability $P[X(l, l + t) \in A]$ uniformly over $l > 0$. We address this problem
by proving the uniform large deviation principle in Theorem 3.3. A collection of probability
measures $(P_{k,t},\ t \ge 1,\ k \in \Gamma)$ satisfies the large deviation principle on $\mathbb{R}$ uniformly over $k \in \Gamma$
if there exist non-negative lower-semicontinuous functions $(I_k(\cdot), k \in \Gamma)$ such that for any
measurable $A \subset \mathbb{R}$

$$\liminf_{t \to \infty} \inf_{k \in \Gamma} \left\{ \frac{1}{t} \log P_{k,t}(A) + \inf_{x \in A^\circ} I_k(x) \right\} \ge 0 \tag{2.9}$$

and

$$\limsup_{t \to \infty} \sup_{k \in \Gamma} \left\{ \frac{1}{t} \log P_{k,t}(A) + \inf_{x \in \bar{A}} I_k(x) \right\} \le 0. \tag{2.10}$$

Note that bounds (2.9) and (2.10) are generalizations of the LHS and RHS of the standard
large-deviation bounds in (2.8).

3. Uniform Large Deviation Principle

The Gärtner-Ellis Theorem is an important tool for proving a large deviation principle, cf.
Gärtner (1977), Ellis (1984) and Dembo and Zeitouni (1998, Theorem 2.3.6, p. 44). Theorem
3.1 is an analog of the Gärtner-Ellis Theorem for proving a uniform large deviation principle.
We use this theorem to prove a uniform large deviation principle for the average of segments
of the server workload process in Theorem 3.3, which is in fact the first step in proving
Theorem 2.2.

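Theorem 3.1. Suppose $(Y_{k,t},\ t \ge 1,\ k \in \Gamma)$ is a collection of random variables such that
there exist functions $(\Lambda_k(\cdot), k \in \Gamma)$ which are differentiable and satisfy the following conditions: for all
$0 < L < \infty$ and $\epsilon > 0$ there exist $T > 0$ and $\delta > 0$ such that

$$\lim_{t \to \infty} \sup_{k \in \Gamma,\ |\lambda| \le L} \left| \Lambda_k(\lambda) - \frac{1}{t} \log E[\exp\{t\lambda Y_{k,t}\}] \right| = 0, \tag{3.1}$$

$$\sup_{k \in \Gamma,\ t > T,\ |\lambda| \le L} \frac{1}{t} \log E[\exp\{t\lambda Y_{k,t}\}] < \infty, \tag{3.2}$$

$$\inf_{k \in \Gamma} |(\Lambda_k)'(\lambda)| \to \infty \quad \text{whenever } |\lambda| \to \infty, \tag{3.3}$$

and

$$|(\Lambda_k)'(\lambda_1) - (\Lambda_k)'(\lambda_2)| < \epsilon \quad \text{for all } |\lambda_1 - \lambda_2| < \delta,\ \lambda_1, \lambda_2 \in [-L, L],\ k \in \Gamma. \tag{3.4}$$

Then for any closed set $F \subset \mathbb{R}$

$$\limsup_{t \to \infty} \sup_{k \in \Gamma} \left\{ \frac{1}{t} \log P[Y_{k,t} \in F] + \inf_{x \in F} \Lambda_k^*(x) \right\} \le 0 \tag{3.5}$$

and for any open set $G \subset \mathbb{R}$

$$\liminf_{t \to \infty} \inf_{k \in \Gamma} \left\{ \frac{1}{t} \log P[Y_{k,t} \in G] + \inf_{x \in G} \Lambda_k^*(x) \right\} \ge 0, \tag{3.6}$$

where the rate function $\Lambda_k^*(\cdot)$ is the Fenchel-Legendre transform of $\Lambda_k(\cdot)$.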



Remark 3.2. It can be observed from the proof below that conditions (3.1), (3.2) and (3.3)
have been used to prove (3.5), whereas all the conditions (3.1)-(3.4) are required for proving
(3.6). Condition (3.1) requires that the normalized log-m.g.f.s of $Y_{k,t}$ converge to $\Lambda_k(\lambda)$
uniformly over $k \in \Gamma$ and locally uniformly in $\lambda \in \mathbb{R}$. Condition (3.2) ensures uniform
exponential tightness of the random variables $(Y_{k,t})$. Condition (3.3) is the equivalent of the
steepness assumption imposed by the Gärtner-Ellis theorem, cf. Dembo and Zeitouni (1998,
Theorem 2.3.6, p. 44). Condition (3.4) requires that the functions $(\Lambda_k)'(\lambda)$ are continuous in
$\lambda$, uniformly over $k \in \Gamma$ and $\lambda$ in a compact subset of $\mathbb{R}$. This ensures that the Fenchel-Legendre
transforms $\Lambda_k^*(x)$ are continuous in $x$, uniformly over $k \in \Gamma$ and $x$ in compact
subsets of $\mathbb{R}$.



Proof. We will first prove (3.5). As (3.5) holds trivially when $F = \emptyset$, we can safely assume
that $F$ is non-empty. To begin with suppose $F$ is compact. Fix any $x \in F$ and $\delta > 0$. Since
$\Lambda_k(\cdot)$ is convex, continuously differentiable and satisfies (3.3), we can find $\lambda_x^k \in \mathbb{R}$ such that

$$(\Lambda_k)'(\lambda_x^k) = x.$$

This would imply

$$\Lambda_k^*(x) = \sup_{\lambda \in \mathbb{R}} \{\lambda x - \Lambda_k(\lambda)\} = \lambda_x^k x - \Lambda_k(\lambda_x^k).$$

From (3.3) we also know that $\{\lambda_x^k : k \in \Gamma\}$ is a bounded set. Hence we can find an open
neighborhood $A_x$ of $x$ such that

$$\inf_{y \in A_x} \lambda_x^k (y - x) \ge -\delta \quad \text{for all } k \in \Gamma.$$


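Then by Chebychev's inequality we get an upper bound for the following probability

$$P[Y_{k,t} \in A_x] \le E[\exp\{\lambda_x^k t (Y_{k,t} - x)\}] \exp\left\{ -t \inf_{y \in A_x} \lambda_x^k (y - x) \right\},$$

which implies

$$\frac{1}{t} \log P[Y_{k,t} \in A_x] \le \frac{1}{t} \log E[\exp\{\lambda_x^k t Y_{k,t}\}] - \lambda_x^k x + \delta.$$

From (3.1) we can get $T > 1$ such that for all $t > T$ and $k \in \Gamma$

$$\frac{1}{t} \log E[\exp\{\lambda_x^k t Y_{k,t}\}] \le \Lambda_k(\lambda_x^k) + \delta,$$

and this means for $t > T$

$$\frac{1}{t} \log P[Y_{k,t} \in A_x] \le \Lambda_k(\lambda_x^k) - \lambda_x^k x + 2\delta = -\Lambda_k^*(x) + 2\delta. \tag{3.7}$$
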



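Now, obviously $\cup_{x \in F} A_x$ is an open cover of $F$ and since $F$ is compact we can obtain
$x_1, \ldots, x_N \in F$ such that $F \subset \cup_{i=1}^{N} A_{x_i}$. Then by a simple union of events bound we
get for $t > T$

$$\frac{1}{t} \log P[Y_{k,t} \in F] + \min_{1 \le i \le N} \Lambda_k^*(x_i) \le \frac{1}{t} \log N + 2\delta \quad \text{for all } k \in \Gamma.$$

It is now easy to see that for $t > T$

$$\sup_{k \in \Gamma} \left\{ \frac{1}{t} \log P[Y_{k,t} \in F] + \inf_{x \in F} \Lambda_k^*(x) \right\} \le \frac{1}{t} \log N + 2\delta$$

and since $0 < \delta < 1$ is arbitrary

$$\limsup_{t \to \infty} \sup_{k \in \Gamma} \left\{ \frac{1}{t} \log P[Y_{k,t} \in F] + \inf_{x \in F} \Lambda_k^*(x) \right\} \le 0. \tag{3.8}$$

This proves (3.5) when $F$ is compact.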

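Next we extend the above result to any non-empty closed set $F$. First we note a few facts.
Using (3.1) and (3.2) we get that for any $\delta > 0$

$$c := \sup_{k \in \Gamma,\ |\lambda| \le \delta} |\Lambda_k(\lambda)| < \infty.$$

Since $\{\lambda_x^k : k \in \Gamma\}$ is bounded and $\Lambda_k^*(x) = \lambda_x^k x - \Lambda_k(\lambda_x^k)$ we get that $\sup_{k \in \Gamma} \Lambda_k^*(x) < \infty$.
Furthermore, for all $k \in \Gamma$

$$\Lambda_k^*(x) = \sup_{\lambda \in \mathbb{R}} \{\lambda x - \Lambda_k(\lambda)\} \ge \sup_{|\lambda| \le \delta} \{\lambda x - \Lambda_k(\lambda)\} \ge \delta |x| - c.$$

Hence for any closed set $F$ there exists $M_1 > 0$ such that

$$\inf_{x \in F} \Lambda_k^*(x) = \inf_{x \in F \cap [-M_1, M_1]} \Lambda_k^*(x) \quad \text{for all } k \in \Gamma. \tag{3.9}$$

Also note that for any $k \in \Gamma$ and $t \ge 1$

$$\frac{1}{t} \log P[|Y_{k,t}| > \theta] \le -\theta + \sup_{k \in \Gamma,\ t \ge 1} \frac{1}{t} \log E\left[e^{t Y_{k,t}}\right] + \sup_{k \in \Gamma,\ t \ge 1} \frac{1}{t} \log E\left[e^{-t Y_{k,t}}\right]$$

and therefore

$$\lim_{\theta \to \infty} \limsup_{t \to \infty} \sup_{k \in \Gamma} \frac{1}{t} \log P[|Y_{k,t}| > \theta] = -\infty. \tag{3.10}$$

Now set

$$c' = \sup_{k \in \Gamma} \inf_{x \in F} \Lambda_k^*(x).$$
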



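Since for any $x$, $\sup_{k \in \Gamma} \Lambda_k^*(x) < \infty$ we get that $c' < \infty$. Note that if $c' = 0$ then the proof is
immediate. So we look into the case when $c' > 0$. Using (3.10) we can get $M_2 > 0$ such that

$$P[|Y_{k,t}| > M_2] \le e^{-2c't} \quad \text{for all } k \in \Gamma,\ t \ge 1.$$

Let $M = \max\{M_1, M_2\}$. Note that from (3.8) and (3.9)

$$\limsup_{t \to \infty} \sup_{k \in \Gamma} \left\{ \frac{1}{t} \log P[Y_{k,t} \in F \cap [-M, M]] + \inf_{x \in F} \Lambda_k^*(x) \right\}
= \limsup_{t \to \infty} \sup_{k \in \Gamma} \left\{ \frac{1}{t} \log P[Y_{k,t} \in F \cap [-M, M]] + \inf_{x \in F \cap [-M, M]} \Lambda_k^*(x) \right\} \le 0.$$

This means that for any given $\delta > 0$ we can find $T > 1$ such that

$$\frac{1}{t} \log P[Y_{k,t} \in F \cap [-M, M]] + \inf_{x \in F} \Lambda_k^*(x) \le \delta \quad \text{for all } k \in \Gamma,\ t > T.$$

Now if $P[Y_{k,t} \in F \cap [-M, M]] \le P[|Y_{k,t}| > M]$ then

$$\frac{1}{t} \log P[Y_{k,t} \in F] \le \frac{1}{t} \log 2 - 2c'.$$

Otherwise,

$$\frac{1}{t} \log P[Y_{k,t} \in F] \le \frac{1}{t} \log 2 + \frac{1}{t} \log P[Y_{k,t} \in F \cap [-M, M]].$$

Therefore, in both the cases,

$$\frac{1}{t} \log P[Y_{k,t} \in F] + \inf_{x \in F} \Lambda_k^*(x) \le \frac{1}{t} \log 2 + \delta \quad \text{for all } k \in \Gamma,\ t > T,$$

and hence

$$\limsup_{t \to \infty} \sup_{k \in \Gamma} \left\{ \frac{1}{t} \log P[Y_{k,t} \in F] + \inf_{x \in F} \Lambda_k^*(x) \right\} \le 0.$$

This completes the proof of (3.5).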

We will now prove (3.6). Note that we can find $M > 0$ such that

$$\inf_{x \in G} \Lambda_k^*(x) = \inf_{x \in G \cap [-M, M]} \Lambda_k^*(x) \quad \text{for all } k \in \Gamma.$$

Fix any $\epsilon > 0$ and get $x^k \in G \cap [-M, M]$ such that

$$\Lambda_k^*(x^k) \le \inf_{x \in G} \Lambda_k^*(x) + \epsilon/2.$$

Another observation that we need to make is that we can find $\delta > 0$ such that

$$|\Lambda_k^*(x) - \Lambda_k^*(y)| < \epsilon/2 \quad \text{for all } |x - y| < \delta,\ x, y \in [-M, M],\ k \in \Gamma.$$

This follows easily from (3.4). Now obviously $\cup_{x \in G \cap [-M, M]} B_{x,\delta}$ is an open cover of
$G \cap [-M, M]$, where $B_{x,\delta} = (x - \delta, x + \delta)$. Since $G \cap [-M, M]$ is precompact, we can find
$x_1, \ldots, x_n \in G \cap [-M, M]$ such that for all $x^k$ there exists $1 \le i_k \le n$ for which $|x^k - x_{i_k}| < \delta$.
This implies that

$$\inf_{1 \le i \le n} \Lambda_k^*(x_i) \le \inf_{x \in G} \Lambda_k^*(x) + \epsilon \quad \text{for all } k \in \Gamma.$$


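For notational simplicity we define $\mathcal{X} = \{x_1, \ldots, x_n\}$. Let $\delta' > 0$ be such that $B_{x,\delta'} \subset G$ for
all $x \in \mathcal{X}$. Now fix any $x \in \mathcal{X}$. Define the random variables $\tilde{Y}_{k,t}$ by an exponential change of
measure such that

$$P[\tilde{Y}_{k,t} \in B] = \frac{E\left[e^{t\lambda_x^k Y_{k,t}} \mathbf{1}\{Y_{k,t} \in B\}\right]}{E\left[e^{t\lambda_x^k Y_{k,t}}\right]} \quad \text{for every measurable } B \subset \mathbb{R}.$$

Then

$$P[Y_{k,t} \in B_{x,\delta'}] = E\left[e^{t\lambda_x^k Y_{k,t}}\right] E\left[e^{-t\lambda_x^k \tilde{Y}_{k,t}} \mathbf{1}\{\tilde{Y}_{k,t} \in B_{x,\delta'}\}\right]$$

and

$$\frac{1}{t} \log P[Y_{k,t} \in B_{x,\delta'}] = \frac{1}{t} \log E\left[e^{t\lambda_x^k Y_{k,t}}\right] + \frac{1}{t} \log E\left[e^{-t\lambda_x^k \tilde{Y}_{k,t}} \mathbf{1}\{\tilde{Y}_{k,t} \in B_{x,\delta'}\}\right]
\ge \frac{1}{t} \log E\left[e^{t\lambda_x^k Y_{k,t}}\right] - \lambda_x^k x - |\lambda_x^k| \delta' + \frac{1}{t} \log P[\tilde{Y}_{k,t} \in B_{x,\delta'}].$$

We claim that

$$\lim_{t \to \infty} \inf_{k \in \Gamma} \frac{1}{t} \log P[\tilde{Y}_{k,t} \in B_{x,\delta'}] = 0. \tag{3.11}$$
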



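To remain with the flow we complete the proof of (3.6) assuming (3.11), which we prove
at the end. Let $M' > 0$ be such that $|(\Lambda_k)'(\lambda)| > M$ for all $|\lambda| > M'$ and $k \in \Gamma$. From
assumption (3.3) we know that $M' < \infty$. We can also get $T > 1$ such that for all $t > T$ and
$x \in \mathcal{X}$,

$$\inf_{k \in \Gamma} \frac{1}{t} \log P[\tilde{Y}_{k,t} \in B_{x,\delta'}] \ge -\epsilon$$

and

$$\sup_{k \in \Gamma} \left| \Lambda_k(\lambda_x^k) - \frac{1}{t} \log E\left[e^{t\lambda_x^k Y_{k,t}}\right] \right| < \epsilon.$$

This implies for all $t > T$, $x \in \mathcal{X}$ and $k \in \Gamma$

$$\frac{1}{t} \log P[Y_{k,t} \in G] \ge \frac{1}{t} \log P[Y_{k,t} \in B_{x,\delta'}] \ge \Lambda_k(\lambda_x^k) - \lambda_x^k x - M'\delta' - 2\epsilon = -\Lambda_k^*(x) - M'\delta' - 2\epsilon.$$

Since $x \in \mathcal{X}$ is arbitrary and $M'$, $\delta'$ and $\epsilon$ are independent of the choice of $x$, we get for all
$t > T$ and $k \in \Gamma$

$$\frac{1}{t} \log P[Y_{k,t} \in G] \ge -\inf_{x \in \mathcal{X}} \Lambda_k^*(x) - M'\delta' - 2\epsilon \ge -\inf_{x \in G} \Lambda_k^*(x) - M'\delta' - 3\epsilon.$$

Hence we get

$$\liminf_{t \to \infty} \inf_{k \in \Gamma} \left\{ \frac{1}{t} \log P[Y_{k,t} \in G] + \inf_{x \in G} \Lambda_k^*(x) \right\} \ge -M'\delta' - 3\epsilon.$$

This completes the proof of (3.6) since $\delta'$ and $\epsilon$ can be chosen arbitrarily close to 0.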

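It now remains to prove (3.11). Since $\mathcal{X}$ is a finite set, it suffices to show that for any
$x \in \mathcal{X}$

$$\lim_{t \to \infty} \inf_{k \in \Gamma} \frac{1}{t} \log P[\tilde{Y}_{k,t} \in B_{x,\delta'}] = 0.$$

We will use the upper large deviation bound (3.5) for that purpose. Note that

$$\frac{1}{t} \log E\left[e^{t\lambda \tilde{Y}_{k,t}}\right] = \frac{1}{t} \log E\left[e^{t(\lambda + \lambda_x^k) Y_{k,t}}\right] - \frac{1}{t} \log E\left[e^{t\lambda_x^k Y_{k,t}}\right]
\to \tilde{\Lambda}_k(\lambda) := \Lambda_k(\lambda + \lambda_x^k) - \Lambda_k(\lambda_x^k).$$

It is easy to check that $\tilde{\Lambda}_k(\cdot)$ inherits the properties (3.1), (3.2), (3.3) and (3.4) from $\Lambda_k(\cdot)$.
Therefore, since $B_{x,\delta'}^c := \{y \in \mathbb{R} : |y - x| \ge \delta'\}$ is a closed set, by (3.5)

$$\limsup_{t \to \infty} \sup_{k \in \Gamma} \left\{ \frac{1}{t} \log P[\tilde{Y}_{k,t} \in B_{x,\delta'}^c] + \inf_{y \in B_{x,\delta'}^c} \tilde{\Lambda}_k^*(y) \right\} \le 0. \tag{3.12}$$

Note that $(\tilde{\Lambda}_k)'(0) = x$ for all $k \in \Gamma$ and that implies $\tilde{\Lambda}_k^*(x) = 0$ for all $k \in \Gamma$. Since $\tilde{\Lambda}_k^*(\cdot)$ is
nonnegative and convex, $\inf_{y \in B_{x,\delta'}^c} \tilde{\Lambda}_k^*(y) \ge \min\{\tilde{\Lambda}_k^*(x - \delta'), \tilde{\Lambda}_k^*(x + \delta')\}$. Now get a compact
set $K'$ such that $|(\tilde{\Lambda}_k)'(\lambda)| > |x| + \delta'$ for all $\lambda \notin K'$ and $k \in \Gamma$, and then find $\eta > 0$ such that

$$|(\tilde{\Lambda}_k)'(\lambda') - (\tilde{\Lambda}_k)'(\lambda'')| < \delta'/2 \quad \text{for all } |\lambda' - \lambda''| < \eta,\ \lambda', \lambda'' \in K',\ k \in \Gamma. \tag{3.13}$$

Then get $\tilde{\lambda}_+^k$ and $\tilde{\lambda}_-^k$ such that $(\tilde{\Lambda}_k)'(\tilde{\lambda}_+^k) = x + \delta'$ and $(\tilde{\Lambda}_k)'(\tilde{\lambda}_-^k) = x - \delta'$. From (3.13)
we know that $\tilde{\lambda}_+^k > \eta$ and $\tilde{\lambda}_-^k < -\eta$ for all $k \in \Gamma$. Therefore, for all $k \in \Gamma$

$$\tilde{\Lambda}_k^*(x + \delta') = \tilde{\lambda}_+^k (x + \delta') - \tilde{\Lambda}_k(\tilde{\lambda}_+^k) = \tilde{\lambda}_+^k (x + \delta') - \int_0^{\tilde{\lambda}_+^k} (\tilde{\Lambda}_k)'(z)\,dz
\ge \tilde{\lambda}_+^k (x + \delta') - (x + \delta'/2)\eta - (\tilde{\lambda}_+^k - \eta)(x + \delta') = \eta\delta'/2$$

and

$$\tilde{\Lambda}_k^*(x - \delta') = \tilde{\lambda}_-^k (x - \delta') - \tilde{\Lambda}_k(\tilde{\lambda}_-^k) = \tilde{\lambda}_-^k (x - \delta') + \int_{\tilde{\lambda}_-^k}^{0} (\tilde{\Lambda}_k)'(z)\,dz
\ge \tilde{\lambda}_-^k (x - \delta') + (x - \delta'/2)\eta + (-\tilde{\lambda}_-^k - \eta)(x - \delta') = \eta\delta'/2.$$

This implies that $\min\{\tilde{\Lambda}_k^*(x - \delta'), \tilde{\Lambda}_k^*(x + \delta')\} \ge \eta\delta'/2$ for all $k \in \Gamma$ and hence using (3.12)
we get

$$\limsup_{t \to \infty} \sup_{k \in \Gamma} \frac{1}{t} \log P[\tilde{Y}_{k,t} \in B_{x,\delta'}^c] \le -\eta\delta'/2.$$

This also means that

$$\lim_{t \to \infty} \inf_{k \in \Gamma} P[\tilde{Y}_{k,t} \in B_{x,\delta'}] = 1.$$

This proves (3.11) and hence completes the proof of the theorem. □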

Theorem 3.3 allows us to approximate the probability of deviations of the average $\bar X(k,l)$ for different segments $(k,l)$ when $l-k$ is large. This is a vital component in the proof of Theorem 2.2.

Theorem 3.3. If Assumption 2.1 holds then for any measurable set $A \subset \mathbb{R}$

\[ \limsup_{t\to\infty}\sup_{k\ge 0}\Big[\frac{1}{t}\log P[\bar X(kt,(k+1)t)\in A] + \inf_{x\in\bar A}\Lambda^{k*}(x)\Big] \le 0 \tag{3.14} \]

and

\[ \liminf_{t\to\infty}\inf_{k\ge 0}\Big[\frac{1}{t}\log P[\bar X(kt,(k+1)t)\in A] + \inf_{x\in A^\circ}\Lambda^{k*}(x)\Big] \ge 0, \tag{3.15} \]

where the rate function $\Lambda^{k*}(\cdot)$ is the Fenchel-Legendre transform of

\[ \Lambda^k(\lambda) := \int_k^{k+1}\Lambda_\xi\Big(\frac{(\alpha+1)\lambda\phi y^\alpha}{(k+1)^{\alpha+1}-k^{\alpha+1}}\,\beta\Big)\,dy, \tag{3.16} \]

and $\Lambda_\xi(\cdot)$ is as defined in (2.1).
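The integral in (3.16) and its Fenchel-Legendre transform are easy to evaluate numerically. The following is a minimal sketch under illustrative assumptions that are not part of the theorem: scalar $\phi = \beta = 1$ and standard normal innovations, so that $\Lambda_\xi(v) = v^2/2$. The function names `weight`, `cgf_k`, `rate_k` and `gaussian_cgf` are hypothetical.

```python
import math

def weight(k, y, alpha):
    # (alpha+1) y^alpha / ((k+1)^(alpha+1) - k^(alpha+1)), the factor inside (3.16)
    return (alpha + 1) * y ** alpha / ((k + 1) ** (alpha + 1) - k ** (alpha + 1))

def cgf_k(lam, k, alpha, cgf_xi, phi=1.0, beta=1.0, n=400):
    # Midpoint-rule approximation of the integral over [k, k+1] in (3.16)
    h = 1.0 / n
    return h * sum(cgf_xi(lam * phi * weight(k, k + (m + 0.5) * h, alpha) * beta)
                   for m in range(n))

def rate_k(x, k, alpha, cgf_xi, lo=-20.0, hi=20.0, iters=80):
    # Fenchel-Legendre transform sup_lam {lam*x - cgf_k(lam)} by golden-section
    # search; the objective is concave in lam. Assumes the maximiser lies in [lo, hi].
    g = lambda lam: lam * x - cgf_k(lam, k, alpha, cgf_xi)
    r = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    for _ in range(iters):
        c, d = b - r * (b - a), a + r * (b - a)
        if g(c) < g(d):
            a = c
        else:
            b = d
    return g(0.5 * (a + b))

def gaussian_cgf(v):
    # Assumed example: standard normal innovations, log E exp(v * xi) = v^2 / 2
    return 0.5 * v * v

if __name__ == "__main__":
    for k in (0, 1, 5, 50):
        print(k, cgf_k(1.0, k, 1.0, gaussian_cgf), rate_k(1.0, k, 1.0, gaussian_cgf))
```

In this Gaussian case with $\alpha = 1$ the printed values of $\Lambda^k(1)$ decrease in $k$ and the values of $\Lambda^{k*}(1)$ increase in $k$, which agrees with the monotonicity stated in Lemma 4.1.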

Proof. The result will follow once we check that the conditions of Theorem 3.1 hold by setting

\[ Y_{k,t} := \bar X(kt,(k+1)t) \quad\text{for all } t\in\mathbb{N},\ k\in\mathbb{R}_+. \]

The most complicated part is to check the uniform convergence condition (3.1): for any $0 < \Lambda < \infty$

\[ \lim_{t\to\infty}\sup_{k\ge 0,\,|\lambda|\le\Lambda}\Big|\Lambda^k(\lambda) - \frac{1}{t}\log E\exp\{t\lambda\bar X(kt,(k+1)t)\}\Big| = 0. \tag{3.17} \]

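We begin by observing that for any $u \in \mathbb{R}$

\[
\begin{aligned}
\log E\big[\exp\{u(S((k+1)t)-S(kt))\}\big]
&= \log E\Big[\exp\Big\{u\sum_{l=kt+1}^{(k+1)t}\sum_{i=1}^{K}\sum_{j=1}^{n_i(l)}X_{i,j}(l)\Big\}\Big] \\
&= \log E\Big[\exp\Big\{u\sum_{l=kt+1}^{(k+1)t}\sum_{i=1}^{K}n_i(l)\beta_i Z(l) + u\sum_{l=kt+1}^{(k+1)t}\sum_{i=1}^{K}\sum_{j=1}^{n_i(l)}\varepsilon_{ij}(l)\Big\}\Big] \\
&= \log E\Big[\exp\Big\{u\sum_{l=kt+1}^{(k+1)t}\sum_{i=1}^{K}n_i(l)\beta_i Z(l)\Big\}\Big] + \log E\Big[\exp\Big\{u\sum_{l=kt+1}^{(k+1)t}\sum_{i=1}^{K}\sum_{j=1}^{n_i(l)}\varepsilon_{ij}(l)\Big\}\Big],
\end{aligned}
\tag{3.18}
\]

where the last equality follows from the independence of the $\varepsilon$'s and the $Z$'s. To understand the first component of (3.18), define $\beta = \sum_{i=1}^{K}c_i\beta_i$ and note that

\[
\begin{aligned}
\log E\Big[\exp\Big\{u\sum_{l=kt+1}^{(k+1)t}\sum_{i=1}^{K}n_i(l)\beta_i Z(l)\Big\}\Big]
&= \log E\Big[\exp\Big\{u\sum_{i=1}^{K}c_i\beta_i\sum_{l=kt+1}^{(k+1)t}\lfloor l^\alpha\rfloor\sum_{j=-\infty}^{\infty}\phi_j\xi(l-j)\Big\}\Big] \\
&= \log E\Big[\exp\Big\{u\Big(\sum_{i=1}^{K}c_i\beta_i\Big)\cdot\Big(\sum_{l=kt+1}^{(k+1)t}\lfloor l^\alpha\rfloor\sum_{j=-\infty}^{\infty}\phi_{l-j}\xi(j)\Big)\Big\}\Big] \\
&= \log E\Big[\exp\Big\{u\beta\cdot\Big(\sum_{j=-\infty}^{\infty}\sum_{l=kt+1}^{(k+1)t}\lfloor l^\alpha\rfloor\phi_{l-j}\xi(j)\Big)\Big\}\Big] \\
&= \sum_{j=-\infty}^{\infty}\Lambda_\xi\Big[u\beta\sum_{l=kt+1}^{(k+1)t}\lfloor l^\alpha\rfloor\phi_{l-j}\Big].
\end{aligned}
\]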



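Using the triangle inequality we get the obvious bound

\[
\begin{aligned}
&\lim_{t\to\infty}\sup_{k\ge 0,\,|\lambda|\le\Lambda}\Big|\Lambda^k(\lambda) - \frac{1}{t}\log E\exp\{t\lambda\bar X(kt,(k+1)t)\}\Big| \\
&\quad\le \lim_{t\to\infty}\sup_{k\ge 0,\,|\lambda|\le\Lambda}\Big|\Lambda^k(\lambda) - \frac{1}{t}\sum_{j=kt+1}^{(k+1)t}\Lambda_\xi\Big(\frac{t\lambda}{N((k+1)t)-N(kt)}\,\beta\sum_{l=kt+1}^{(k+1)t}\lfloor l^\alpha\rfloor\phi_{l-j}\Big)\Big| \\
&\qquad+ \lim_{L\to\infty}\lim_{t\to\infty}\sup_{k\ge 0,\,|\lambda|\le\Lambda}\Big|\frac{1}{t}\sum_{j=-\infty}^{kt-L}\Lambda_\xi\Big(\frac{t\lambda}{N((k+1)t)-N(kt)}\,\beta\sum_{l=kt+1}^{(k+1)t}\lfloor l^\alpha\rfloor\phi_{l-j}\Big)\Big| \\
&\qquad+ \lim_{L\to\infty}\lim_{t\to\infty}\sup_{k\ge 0,\,|\lambda|\le\Lambda}\Big|\frac{1}{t}\sum_{j=(k+1)t+L}^{\infty}\Lambda_\xi\Big(\frac{t\lambda}{N((k+1)t)-N(kt)}\,\beta\sum_{l=kt+1}^{(k+1)t}\lfloor l^\alpha\rfloor\phi_{l-j}\Big)\Big| \\
&\qquad+ \lim_{t\to\infty}\sup_{k\ge 0,\,|\lambda|\le\Lambda}\Big|\frac{1}{t}\sum_{\substack{kt-L<j\le kt \text{ or}\\ (k+1)t<j<(k+1)t+L}}\Lambda_\xi\Big(\frac{t\lambda}{N((k+1)t)-N(kt)}\,\beta\sum_{l=kt+1}^{(k+1)t}\lfloor l^\alpha\rfloor\phi_{l-j}\Big)\Big| \\
&\qquad+ \lim_{t\to\infty}\sup_{k\ge 0,\,|\lambda|\le\Lambda}\Big|\frac{1}{t}\log E\exp\Big\{\frac{t\lambda}{N((k+1)t)-N(kt)}\sum_{l=kt+1}^{(k+1)t}\sum_{i=1}^{K}\sum_{j=1}^{n_i(l)}\varepsilon_{ij}(l)\Big\}\Big|.
\end{aligned}
\tag{3.19}
\]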



We will prove (3.17) by showing that each of the terms in the above expression is equal to 0. For that purpose we make use of the following facts:

(i) there exists $M' > 0$ such that

\[ \frac{t\,((k+1)t)^\alpha}{(kt+1)^\alpha + \cdots + ((k+1)t)^\alpha} < M' \quad\text{for all } t\ge 1,\ k\ge 0. \]

(ii) Given any $0 < \epsilon < 1/2$, there exists $K_1 > 0$ such that

\[ |\Lambda_\xi(u)-\Lambda_\xi(v)| \le K_1\|u-v\| \quad\text{whenever } \|u\|\le M,\ \|v\|\le M \text{ and } \|u-v\|\le\epsilon, \]

where $\|\cdot\|$ denotes the sup-norm on $\mathbb{R}^K$ and

\[ M = M'\Lambda\|\beta\|\sum_{k=-\infty}^{\infty}|\phi_k|, \]

(iii) and there exists $L > 1$ such that $\sum_{|k|>L}|\phi_k| < \epsilon/(M'\Lambda\|\beta\|)$.
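Fact (i) can be checked numerically on a grid. The sketch below restricts to integer $k$ and a finite range of $t$, which is an assumption made purely for illustration; the helper names `ratio` and `worst_ratio` are hypothetical. Comparing the sum in the denominator with $\int_{kt}^{(k+1)t}x^\alpha\,dx$ suggests that, for integer $k$ and $\alpha\ge 0$, the ratio never exceeds $\alpha+1$.

```python
def ratio(t, k, alpha):
    # t * ((k+1)t)^alpha divided by (kt+1)^alpha + ... + ((k+1)t)^alpha
    top = t * ((k + 1) * t) ** alpha
    bottom = sum(l ** alpha for l in range(k * t + 1, (k + 1) * t + 1))
    return top / bottom

def worst_ratio(alpha, t_max=60, k_max=40):
    # Largest ratio over the grid 1 <= t <= t_max, 0 <= k <= k_max (integer k only)
    return max(ratio(t, k, alpha)
               for t in range(1, t_max + 1)
               for k in range(k_max + 1))

if __name__ == "__main__":
    for alpha in (0.5, 1.0, 1.5, 2.0):
        print(alpha, worst_ratio(alpha))
```

On this grid the largest values occur at $k = 0$ with large $t$, where the denominator behaves like $t^{\alpha+1}/(\alpha+1)$.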

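We get (ii) since $\Lambda_\xi(\cdot)$ is convex and differentiable (cf. Lemma 2.2.5 in Dembo and Zeitouni (1998)) and (iii) follows from the summability of the coefficients $(\phi_k)$.
Define the function $f_{t,k} : (k, k+1) \to \mathbb{R}$ by

\[ f_{t,k}(y) := \Lambda_\xi\Big(\frac{t\lambda}{N((k+1)t)-N(kt)}\sum_{l=kt+1}^{(k+1)t}\lfloor l^\alpha\rfloor\phi_{l-\lceil ty\rceil}\,\beta\Big) \]

and note that

\[ \frac{1}{t}\sum_{j=kt+1}^{(k+1)t}\Lambda_\xi\Big(\frac{t\lambda}{N((k+1)t)-N(kt)}\,\beta\sum_{l=kt+1}^{(k+1)t}\lfloor l^\alpha\rfloor\phi_{l-j}\Big) = \int_k^{k+1}f_{t,k}(y)\,dy. \]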

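Choose $t$ large enough such that $kt+1 < \lceil ty\rceil - L$, $\lceil ty\rceil + L < (k+1)t$ and

\[ \Big|\frac{t\lfloor l^\alpha\rfloor}{N((k+1)t)-N(kt)} - \frac{(\alpha+1)y^\alpha}{(k+1)^{\alpha+1}-k^{\alpha+1}}\Big| < \frac{\epsilon}{\Lambda\|\beta\|\sum_{k=-\infty}^{\infty}|\phi_k|} \]

for all $k\ge 0$, $k+\epsilon < y < k+1-\epsilon$ and $\lceil ty\rceil - L \le l \le \lceil ty\rceil + L$. It is easy to check that for $y$ in this range and $|\lambda|\le\Lambda$

\[ \Big\|\frac{t\lambda\beta}{N((k+1)t)-N(kt)}\sum_{l=kt+1}^{(k+1)t}\lfloor l^\alpha\rfloor\phi_{l-\lceil ty\rceil} - \frac{t\lambda\beta}{N((k+1)t)-N(kt)}\sum_{l=\lceil ty\rceil-L}^{\lceil ty\rceil+L}\lfloor l^\alpha\rfloor\phi_{l-\lceil ty\rceil}\Big\| < \epsilon, \]

\[ \Big\|\frac{t\lambda\beta}{N((k+1)t)-N(kt)}\sum_{l=\lceil ty\rceil-L}^{\lceil ty\rceil+L}\lfloor l^\alpha\rfloor\phi_{l-\lceil ty\rceil} - \frac{(\alpha+1)\lambda y^\alpha\beta}{(k+1)^{\alpha+1}-k^{\alpha+1}}\sum_{l=\lceil ty\rceil-L}^{\lceil ty\rceil+L}\phi_{l-\lceil ty\rceil}\Big\| < \epsilon \]

and

\[ \Big\|\frac{(\alpha+1)\lambda y^\alpha\beta}{(k+1)^{\alpha+1}-k^{\alpha+1}}\sum_{l=\lceil ty\rceil-L}^{\lceil ty\rceil+L}\phi_{l-\lceil ty\rceil} - \frac{(\alpha+1)\lambda y^\alpha\beta}{(k+1)^{\alpha+1}-k^{\alpha+1}}\sum_{l=-\infty}^{\infty}\phi_l\Big\| < \epsilon. \]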



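This implies for all $k\ge 0$, $k+\epsilon < y < k+1-\epsilon$ and $|\lambda|\le\Lambda$

\[ \Big|\Lambda_\xi\Big(\frac{(\alpha+1)\lambda\phi y^\alpha}{(k+1)^{\alpha+1}-k^{\alpha+1}}\,\beta\Big) - f_{t,k}(y)\Big| < 3K_1\epsilon \]

and hence we get

\[ \lim_{t\to\infty}\sup_{k\ge 0,\,|\lambda|\le\Lambda}\Big|\Lambda^k(\lambda) - \frac{1}{t}\sum_{j=kt+1}^{(k+1)t}\Lambda_\xi\Big(\frac{t\lambda}{N((k+1)t)-N(kt)}\,\beta\sum_{l=kt+1}^{(k+1)t}\lfloor l^\alpha\rfloor\phi_{l-j}\Big)\Big| \le 3K_1\epsilon + 4M_1\epsilon, \tag{3.20} \]

where

\[ M_1 = \max\Big\{\Lambda_\xi\Big(M'\Lambda\|\beta\|\sum_{k=-\infty}^{\infty}|\phi_k|\Big),\ \Lambda_\xi\Big(-M'\Lambda\|\beta\|\sum_{k=-\infty}^{\infty}|\phi_k|\Big)\Big\}. \tag{3.21} \]

Obviously, since $\epsilon$ is arbitrary we get that the limit in (3.20) is 0.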

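The other parts in (3.19) are handled much more easily. Note that for any $k\ge 0$

\[ \Big|\sum_{j=-\infty}^{kt-L}\Lambda_\xi\Big(\frac{t\lambda}{N((k+1)t)-N(kt)}\,\beta\sum_{l=kt+1}^{(k+1)t}\lfloor l^\alpha\rfloor\phi_{l-j}\Big)\Big| \le K_1M'\Lambda\|\beta\|\sum_{j=-\infty}^{kt-L}\sum_{l=kt+1}^{(k+1)t}|\phi_{l-j}| \le tK_1\epsilon \]

and hence

\[ \lim_{L\to\infty}\lim_{t\to\infty}\sup_{k\ge 0,\,|\lambda|\le\Lambda}\Big|\frac{1}{t}\sum_{j=-\infty}^{kt-L}\Lambda_\xi\Big(\frac{t\lambda}{N((k+1)t)-N(kt)}\,\beta\sum_{l=kt+1}^{(k+1)t}\lfloor l^\alpha\rfloor\phi_{l-j}\Big)\Big| = 0. \tag{3.22} \]

Using a similar argument we also get

\[ \lim_{L\to\infty}\lim_{t\to\infty}\sup_{k\ge 0,\,|\lambda|\le\Lambda}\Big|\frac{1}{t}\sum_{j=(k+1)t+L}^{\infty}\Lambda_\xi\Big(\frac{t\lambda}{N((k+1)t)-N(kt)}\,\beta\sum_{l=kt+1}^{(k+1)t}\lfloor l^\alpha\rfloor\phi_{l-j}\Big)\Big| = 0. \tag{3.23} \]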



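Furthermore, it is also easy to check that for every $L\ge 1$

\[ \lim_{t\to\infty}\sup_{k\ge 0,\,|\lambda|\le\Lambda}\Big|\frac{1}{t}\sum_{\substack{kt-L<j\le kt \text{ or}\\ (k+1)t<j<(k+1)t+L}}\Lambda_\xi\Big(\frac{t\lambda}{N((k+1)t)-N(kt)}\,\beta\sum_{l=kt+1}^{(k+1)t}\lfloor l^\alpha\rfloor\phi_{l-j}\Big)\Big| \le \lim_{t\to\infty}\frac{1}{t}\,2(L+1)M_1 = 0. \tag{3.24} \]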



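For the final part of the proof of (3.17), we note the following facts about $\Lambda_\varepsilon(\cdot)$: $\Lambda_\varepsilon(0) = 0$, $\Lambda_\varepsilon'(0) = 0$ because $E(\varepsilon_{ij}(t)) = 0$, and $\Lambda_\varepsilon(\cdot)$ is nonnegative and twice continuously differentiable in a neighborhood of 0. The last fact can be easily derived following Lemma 2.2.5 in Dembo and Zeitouni (1998). This implies that there exist positive constants $\kappa$ and $\eta$ such that

\[ |\Lambda_\varepsilon(u)| \le \kappa u^2 \quad\text{for all } |u| < \eta. \]

Choose $t$ large enough such that $t\Lambda/N(t) < \eta$. This also means that $|t\lambda/(N((k+1)t)-N(kt))| < \eta$ for all $k\ge 0$ and $|\lambda|\le\Lambda$. Hence, we have

\[
\begin{aligned}
\Big|\log E\exp\Big\{\frac{t\lambda}{N((k+1)t)-N(kt)}\sum_{l=kt+1}^{(k+1)t}\sum_{i=1}^{K}\sum_{j=1}^{n_i(l)}\varepsilon_{ij}(l)\Big\}\Big|
&= \Big|\sum_{l=kt+1}^{(k+1)t}\sum_{i=1}^{K}\sum_{j=1}^{n_i(l)}\Lambda_\varepsilon\Big(\frac{t\lambda}{N((k+1)t)-N(kt)}\Big)\Big| \\
&\le (N((k+1)t)-N(kt))\,\kappa\,\frac{t^2\lambda^2}{(N((k+1)t)-N(kt))^2}.
\end{aligned}
\]

This immediately gives us

\[ \lim_{t\to\infty}\sup_{k\ge 0,\,|\lambda|\le\Lambda}\Big|\frac{1}{t}\log E\exp\Big\{\frac{t\lambda}{N((k+1)t)-N(kt)}\sum_{l=kt+1}^{(k+1)t}\sum_{i=1}^{K}\sum_{j=1}^{n_i(l)}\varepsilon_{ij}(l)\Big\}\Big| \le \lim_{t\to\infty}\kappa\sup_{k\ge 0,\,|\lambda|\le\Lambda}\frac{t\lambda^2}{N((k+1)t)-N(kt)} = 0, \tag{3.25} \]

and that completes the proof of (3.17).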
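The convergence behind (3.17) can be illustrated numerically for the moving-average part. The sketch below is only an illustration under assumptions not made in the proof: scalar $\beta = 1$, Gaussian innovations with $\Lambda_\xi(v) = v^2/2$, finitely many nonzero coefficients $\phi_m$, and $N(t) = \sum_{l\le t}\lfloor l^\alpha\rfloor$. The names `finite_t_value` and `limit_value` are hypothetical.

```python
import math

def finite_t_value(lam, k, t, alpha, phi, beta=1.0):
    # (1/t) * sum_j Lambda_xi( t*lam/(N((k+1)t)-N(kt)) * beta * sum_l floor(l^alpha) * phi_{l-j} )
    # with Lambda_xi(v) = v^2/2 (assumed Gaussian) and phi = {lag: coefficient}.
    lo, hi = k * t + 1, (k + 1) * t
    dN = sum(math.floor(l ** alpha) for l in range(lo, hi + 1))
    total = 0.0
    # Only indices j with some l = j + lag inside [lo, hi] contribute.
    for j in range(lo - max(phi), hi - min(phi) + 1):
        inner = sum(math.floor((j + lag) ** alpha) * c
                    for lag, c in phi.items() if lo <= j + lag <= hi)
        v = t * lam / dN * beta * inner
        total += 0.5 * v * v
    return total / t

def limit_value(lam, k, alpha, phi_sum, beta=1.0):
    # Closed form of (3.16) when Lambda_xi(v) = v^2/2:
    # 0.5 * (lam*phi*beta)^2 * integral over [k, k+1] of the squared weight.
    D = (k + 1) ** (alpha + 1) - k ** (alpha + 1)
    int_w2 = ((alpha + 1) ** 2
              * ((k + 1) ** (2 * alpha + 1) - k ** (2 * alpha + 1))
              / ((2 * alpha + 1) * D * D))
    return 0.5 * (lam * phi_sum * beta) ** 2 * int_w2

if __name__ == "__main__":
    phi = {0: 0.5, 1: 0.3, 2: 0.2}
    for t in (50, 500, 5000):
        print(t, finite_t_value(1.0, 1, t, 1.0, phi), limit_value(1.0, 1, 1.0, sum(phi.values())))
```

With these choices the finite-$t$ values approach the limiting value as $t$ grows; the sketch does not address uniformity in $k$ and $\lambda$, which is the substance of the proof above.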

It is simpler to check the other conditions of Theorem 3.1. Note that we can find $M$ such that

\[ \frac{(\alpha+1)y^\alpha}{(k+1)^{\alpha+1}-k^{\alpha+1}} < M \quad\text{for all } k\ge 0,\ k\le y\le k+1. \tag{3.26} \]

This implies that for any $\Lambda > 0$

\[ \sup_{k\ge 0,\,|\lambda|\le\Lambda}|\Lambda^k(\lambda)| < \infty, \]

and this combined with (3.17) shows that the condition (3.2) holds.

Next we check that $\Lambda^k(\cdot)$ is differentiable. Since $\Lambda_\xi(\cdot)$ is finite everywhere, by Lemma 2.2.5 in Dembo and Zeitouni (1998) we get that $\Lambda_\xi(\cdot)$ is differentiable and

\[ \Lambda_\xi'(v) = \frac{E[\xi(0)e^{v\cdot\xi(0)}]}{E[e^{v\cdot\xi(0)}]}. \]

For any $\delta$ satisfying $0 < \|\delta\| < 1$

\[ \|ze^{(v+\delta)\cdot z} - ze^{v\cdot z}\| \le h(z) := \|z\|e^{v\cdot z}(e^{\|z\|}+1). \]

Since $E[h(\xi(0))] < \infty$, using the dominated convergence theorem we get that $E[\xi(0)e^{v\cdot\xi(0)}]$ is continuous. This implies that $\Lambda_\xi'(\cdot)$ is continuous. Now we can use the Leibniz integral rule (cf. Theorem 7.40 in Apostol (1974)) to get that $\Lambda^k(\cdot)$ is differentiable and

\[ (\Lambda^k)'(\lambda) := \int_k^{k+1}\frac{(\alpha+1)\phi y^\alpha}{(k+1)^{\alpha+1}-k^{\alpha+1}}\,\beta\cdot\Lambda_\xi'\Big(\frac{(\alpha+1)\lambda\phi y^\alpha}{(k+1)^{\alpha+1}-k^{\alpha+1}}\,\beta\Big)\,dy. \]






S. GHOSH AND S. GHOSH 



It is easy to see that $\|\Lambda_\xi'(\eta)\| \to \infty$ whenever $\|\eta\| \to \infty$. This combined with (3.26) shows that (3.3) holds. Finally, (3.4) follows from the fact that $\Lambda_\xi'(\cdot)$ is continuous on compact sets and (3.26). This completes the proof of the theorem. □



4. Proof of Theorem 2.2 and Required Lemmas

Proof of Theorem 2.2. We will first prove the lower inequality in (2.4). The inequality is obvious when $I^* = 0$. Also, if $A$ is nonempty then $I^* < \infty$ from Assumption 2.1. So it suffices to consider $0 < I^* < \infty$. We will use the simple inclusion bound: for all $m\ge 1$ and $r\ge 1$

\[ \{T_r(A)\le m\} \subset \bigcup_{l=r}^{\infty}\bigcup_{j=0}^{m-1}\{\bar X(j,j+l)\in A\}. \]

Thus we get

\[ P[T_r(A)\le m] \le \sum_{l=r}^{\infty}\sum_{j=0}^{m-1}P[\bar X(j,j+l)\in A]. \]
l=r j=0 

Lemma 4.1 below shows that $\Lambda^{k*}(x)$ is an increasing function of $k$ for fixed $x$. Lemma 4.2, which builds on this, gives the existence of a $K_0$ such that

\[ \inf_{x\in A}\Lambda^{k*}(x) \ge \inf_{x\in A}\Lambda^*(x) - \epsilon/3 = I^* - \epsilon/3 \quad\text{for all } k\ge K_0. \]

We can also find, from Lemma 4.3, a constant $I > 0$ such that $I < \inf_{x\in A}\Lambda^{k*}(x)$ for all $k\ge 0$. Now for any $0 < \epsilon < I$, by Theorem 3.3 we can get $T\ge 1$ such that for all $l\ge T$ and all $k\ge 0$

\[ P\big[\bar X(kl,(k+1)l)\in A\big] \le \exp\Big\{-l\Big(\inf_{x\in A}\Lambda^{k*}(x) - \epsilon/3\Big)\Big\}. \]

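This gives us for $r\ge T$

\[
\begin{aligned}
P[T_r(A)\le m] &\le \sum_{l=r}^{\infty}\sum_{j=0}^{m-1}P[\bar X(j,j+l)\in A] \\
&\le \sum_{l=r}^{\infty}\sum_{j=0}^{K_0l}P[\bar X(j,j+l)\in A] + \sum_{l=r}^{\infty}\sum_{j=K_0l}^{m-1}P[\bar X(j,j+l)\in A].
\end{aligned}
\]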

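Now set $m = \lfloor e^{r(I^*-\epsilon)}\rfloor$ and note that

\[ \sum_{r=1}^{\infty}P\big[T_r(A)\le\lfloor e^{r(I^*-\epsilon)}\rfloor\big] \le T + \sum_{r=T}^{\infty}\sum_{l=r}^{\infty}K_0l\,e^{-l(I-\epsilon/3)} + \sum_{r=T}^{\infty}e^{r(I^*-\epsilon)}\sum_{l=r}^{\infty}e^{-l(I^*-2\epsilon/3)} < \infty. \]

Hence, using the Borel-Cantelli lemma we get

\[ \liminf_{r\to\infty}\frac{\log T_r(A)}{r} \ge I^* - \epsilon \quad\text{a.s.} \]

The lower inequality in (2.4) is thus proved by letting $\epsilon\to 0$.

Also observe that, using the relation $\{T_r(A)\le m\} = \{R_m(A)\ge r\}$, we get

\[ \limsup_{t\to\infty}\frac{R_t(A)}{\log t} \le \frac{1}{I^*} \quad\text{a.s.} \]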

In order to prove the upper bound in (2.4) it suffices to consider the case $I^* < \infty$. In that case the set $A$ has nonempty interior. Define two new random variables by

\[ Y'_{k,t} := \beta\,\frac{\sum_{j=kt+1}^{(k+1)t}\sum_{l=kt+1}^{(k+1)t}\lfloor l^\alpha\rfloor\phi_{l-j}\,\xi(j)}{N((k+1)t)-N(kt)}, \qquad Y''_{k,t} := \bar X(kt,(k+1)t) - Y'_{k,t}, \]

where, as before, $\beta = \sum_{i=1}^{K}c_i\beta_i$. For a set $A$ and $\eta > 0$, define

\[ A(\eta) := \{x : d(x,A^c) > \eta\}, \]

where $d(x,A^c)$ is the distance from the point $x$ to the complement $A^c$. Now observe that for any positive integers $r$ and $q$ with $q > r$

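\[
\begin{aligned}
P[T_r(A) > q] &\le P\big[\bar X(kr,(k+1)r)\notin A,\ k=0,\ldots,\lfloor q/r\rfloor\big] \\
&\le P\big[Y'_{k,r}\notin A(\eta),\ k=0,\ldots,\lfloor q/r\rfloor\big] + \sum_{k=0}^{\lfloor q/r\rfloor}P\big[|Y''_{k,r}| > \eta\big]
\end{aligned}
\]

and since $Y'_{k,r}$, $k = 0,1,\ldots,\lfloor q/r\rfloor$, are independent

\[
\begin{aligned}
&= \prod_{k=0}^{\lfloor q/r\rfloor}\big(1 - P[Y'_{k,r}\in A(\eta)]\big) + \sum_{k=0}^{\lfloor q/r\rfloor}P\big[|Y''_{k,r}| > \eta\big] \\
&\le \exp\Big(-\sum_{k=0}^{\lfloor q/r\rfloor}P[Y'_{k,r}\in A(\eta)]\Big) + \sum_{k=0}^{\lfloor q/r\rfloor}P\big[|Y''_{k,r}| > \eta\big].
\end{aligned}
\]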

From the arguments following (3.20) it is easy to check that the laws of $Y'_{k,t}$ satisfy a large deviation principle uniformly over $k\ge 0$ with rate function $\Lambda^{k*}(\cdot)$. We can therefore get $T\ge 1$ such that

\[ \frac{1}{t}\log P[Y'_{k,t}\in A(\eta)] \ge -\inf_{x\in A(\eta)}\Lambda^{k*}(x) - \epsilon/4 \quad\text{for all } t\ge T,\ k\ge 0. \]

Lemma 4.1(ii) then implies

\[ \frac{1}{t}\log P[Y'_{k,t}\in A(\eta)] \ge -\inf_{x\in A(\eta)}\Lambda^{*}(x) - \epsilon/4 \quad\text{for all } t\ge T,\ k\ge 0. \]

Hence for $\eta > 0$ small enough

\[ \frac{1}{t}\log P[Y'_{k,t}\in A(\eta)] \ge -I^* - \epsilon/2 \quad\text{for all } t\ge T,\ k\ge 0. \]

Therefore, by setting $q_r = \lceil e^{r(I^*+\epsilon)}\rceil$ and using the above inequality we get that

\[
\begin{aligned}
\sum_{r=1}^{\infty}\exp\Big(-\sum_{k=0}^{\lfloor q_r/r\rfloor}P[Y'_{k,r}\in A(\eta)]\Big)
&\le T + \sum_{r=T}^{\infty}\exp\Big(-\frac{e^{r(I^*+\epsilon)}}{r}\,e^{-r(I^*+\epsilon/2)}\Big) \\
&= T + \sum_{r=T}^{\infty}\exp\Big(-\frac{e^{r\epsilon/2}}{r}\Big) < \infty.
\end{aligned}
\tag{4.1}
\]
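Two elementary facts drive (4.1): the product of the "no hit" probabilities is bounded by an exponential, and the resulting summands decay doubly exponentially in $r$. A small numerical sketch follows; the values of $I^*$ and $\epsilon$ used in the demo are arbitrary illustrations, and the function names are hypothetical.

```python
import math

def no_hit_bound(probs):
    # Returns (prod_k (1 - p_k), exp(-sum_k p_k)); since 1 - p <= exp(-p),
    # the first value never exceeds the second.
    return math.prod(1.0 - p for p in probs), math.exp(-sum(probs))

def term_41(r, i_star, eps):
    # exp(-(e^{r(I*+eps)} / r) * e^{-r(I*+eps/2)}): the r-th summand in (4.1)
    return math.exp(-(math.exp(r * (i_star + eps)) / r)
                    * math.exp(-r * (i_star + eps / 2)))

def simplified_41(r, eps):
    # The same summand after cancelling I*: exp(-e^{r*eps/2} / r)
    return math.exp(-math.exp(r * eps / 2) / r)

if __name__ == "__main__":
    print(no_hit_bound([0.1, 0.5, 0.03]))
    print(sum(simplified_41(r, 0.5) for r in range(1, 200)))
```

The partial sums stabilise after a few dozen terms because $e^{r\epsilon/2}/r$ grows exponentially, which is the reason the series in (4.1) is finite.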

Furthermore, note that for $\epsilon > 0$ and $\eta > 0$ such that the above holds

\[
\begin{aligned}
\limsup_{t\to\infty}\sup_{k\ge 0}\frac{1}{t}\log P\big[|Y''_{k,t}| > \eta\big]
&\le -\lambda\eta + \limsup_{t\to\infty}\sup_{k\ge 0}\frac{1}{t}\log E\big[e^{t\lambda|Y''_{k,t}|}\big] = -\lambda\eta.
\end{aligned}
\]

The last equality follows from the steps used in the proof of Theorem 3.3. Now by choosing $\lambda > (I^*+\epsilon)/\eta$ we get

\[ \sum_{r=1}^{\infty}\sum_{k=0}^{\lfloor q_r/r\rfloor}P\big[|Y''_{k,r}| > \eta\big] \le \sum_{r=1}^{\infty}\Big(\frac{q_r}{r}+1\Big)\sup_{k\ge 0}P\big[|Y''_{k,r}| > \eta\big] < \infty. \tag{4.2} \]

Combining (4.1) and (4.2) we get

\[ \sum_{r=1}^{\infty}P[T_r(A) > q_r] < \infty. \]

Finally by applying the first Borel-Cantelli lemma and then letting $\epsilon\to 0$ we complete the proof of the upper bound of (2.4). The lower bound in (2.5) is again proved using the same identity $\{T_r(A)\le m\} = \{R_m(A)\ge r\}$. Hence the proof is complete. □
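The duality $\{T_r(A)\le m\} = \{R_m(A)\ge r\}$ used twice in the proof can be illustrated on a finite sequence. The definitions of $T_r(A)$ and $R_m(A)$ are not restated in this section, so the sketch below assumes the following readings: $T_r(A)$ is the first time by which some window of length at least $r$ has average in $A$, and $R_m(A)$ is the longest window within the first $m$ observations whose average lies in $A$. It also uses plain (unweighted) averages of a deterministic toy sequence rather than the weighted averages $\bar X$ of the paper. All names are hypothetical.

```python
from itertools import accumulate

def first_time(r, xs, in_A):
    # Assumed definition: smallest m such that some window of length l >= r
    # ending at or before m has average in A; None if no such window exists.
    prefix = [0.0] + list(accumulate(xs))
    for m in range(r, len(xs) + 1):
        if any(in_A((prefix[m] - prefix[m - l]) / l) for l in range(r, m + 1)):
            return m
    return None

def longest_run(m, xs, in_A):
    # Assumed definition: largest window length l such that some window inside
    # the first m observations has average in A; 0 if there is none.
    prefix = [0.0] + list(accumulate(xs))
    best = 0
    for end in range(1, m + 1):
        for l in range(best + 1, end + 1):
            if in_A((prefix[end] - prefix[end - l]) / l):
                best = l
    return best

if __name__ == "__main__":
    xs = [((i * 37) % 11) / 10 for i in range(40)]   # deterministic toy data
    in_A = lambda avg: avg > 0.6                      # A = (0.6, infinity)
    for r in (1, 2, 3, 5):
        print(r, first_time(r, xs, in_A), longest_run(len(xs), xs, in_A))
```

Under these assumed definitions the two events coincide for every pair $(r, m)$, because both say that some window of length at least $r$ ending by time $m$ has its average in $A$.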

Lemma 4.1. (i) For any $\lambda\in\mathbb{R}$, $\Lambda^k(\lambda)$ is a decreasing function of $k$.
(ii) For any $x\in\mathbb{R}$, $\Lambda^{k*}(x)$ is an increasing function of $k$.

Proof. Suppose $F_k$ is the distribution function of the random variables

\[ U_k := \frac{(\alpha+1)(k+U)^\alpha}{(k+1)^{\alpha+1}-k^{\alpha+1}}, \quad\text{where } U\sim\text{Uniform}(0,1),\ k\ge 0. \]

Observe that $E(U_k) = 1$ for all $k\ge 0$. Also, for any non-negative random variable $X$ with mean 1 and distribution $F_X$, define the Lorenz function

\[ L_X(p) := \int_0^p F_X^{-1}(u)\,du, \quad\text{for all } 0\le p\le 1. \]

Note that the Lorenz function of $U_k$ is given by

\[ L_{U_k}(p) = \frac{(k+p)^{\alpha+1}-k^{\alpha+1}}{(k+1)^{\alpha+1}-k^{\alpha+1}}, \quad\text{for all } 0\le p\le 1, \]

and

\[ \frac{\partial}{\partial k}L_{U_k}(p) = \frac{(\alpha+1)\big[(k+p)^\alpha\big((k+1)^\alpha(1-p)+k^\alpha p\big) - k^\alpha(k+1)^\alpha\big]}{\big((k+1)^{\alpha+1}-k^{\alpha+1}\big)^2} \ge 0 \]

for all $k\ge 0$ and $0\le p\le 1$. This implies that

\[ L_{U_{k'}}(p) \ge L_{U_{k''}}(p) \quad\text{for all } 0\le p\le 1,\ k'\ge k'', \]

which means that $U_k$ is decreasing in Lorenz order as $k$ increases. Hence by (Arnold, 1980, Theorem 3.2, p. 37) and using the fact that $\Lambda_\xi(\cdot)$ is convex and continuous we get that

\[ \Lambda^k(\lambda) = E[\Lambda_\xi(\lambda\phi U_k\beta)] \]

is decreasing in $k$.

Part (ii) of the lemma follows easily from part (i) using the definition of the Fenchel-Legendre transform. □
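The closed forms for the Lorenz function of $U_k$ and its derivative in $k$ can be cross-checked numerically. The sketch below compares the derivative formula with a central finite difference and checks its sign on a grid; the function names `lorenz` and `dlorenz_dk` and the grid values are illustrative choices only.

```python
def lorenz(k, p, alpha):
    # L_{U_k}(p) = ((k+p)^(alpha+1) - k^(alpha+1)) / ((k+1)^(alpha+1) - k^(alpha+1))
    D = (k + 1) ** (alpha + 1) - k ** (alpha + 1)
    return ((k + p) ** (alpha + 1) - k ** (alpha + 1)) / D

def dlorenz_dk(k, p, alpha):
    # Closed-form derivative of L_{U_k}(p) with respect to k
    D = (k + 1) ** (alpha + 1) - k ** (alpha + 1)
    num = ((k + p) ** alpha * ((k + 1) ** alpha * (1 - p) + k ** alpha * p)
           - k ** alpha * (k + 1) ** alpha)
    return (alpha + 1) * num / (D * D)

if __name__ == "__main__":
    h = 1e-6
    for alpha in (0.5, 1.0, 2.0):
        for k in (0.5, 2.0, 10.0):
            p = 0.3
            fd = (lorenz(k + h, p, alpha) - lorenz(k - h, p, alpha)) / (2 * h)
            print(alpha, k, fd, dlorenz_dk(k, p, alpha))
```

The sign of the numerator follows from convexity of $x\mapsto x^{-\alpha}$: dividing by $(k+p)^\alpha k^\alpha(k+1)^\alpha$ turns the nonnegativity claim into $(1-p)k^{-\alpha} + p(k+1)^{-\alpha} \ge (k+p)^{-\alpha}$, which is Jensen's inequality at the point $k+p = (1-p)k + p(k+1)$.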

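Lemma 4.2. For any measurable set $A\subset\mathbb{R}$ and $\epsilon > 0$ there exists $K_0$ such that

\[ \inf_{x\in A}\Lambda^{k*}(x) \ge \inf_{x\in A}\Lambda^*(x) - \epsilon \quad\text{for all } k\ge K_0, \]

where $\Lambda^{k*}(\cdot)$ and $\Lambda^*(\cdot)$ are as described in Theorem 3.3 and Theorem 2.2 respectively.
Proof. Fix any $\epsilon > 0$. From the arguments leading to (3.9) we can find $M_1 > 0$ such that

\[ \inf_{x\in A}\Lambda^*(x) = \inf_{x\in A\cap[-M_1,M_1]}\Lambda^*(x). \]

Lemma 4.1(ii) then gives us

\[ \inf_{x\in A}\Lambda^{k*}(x) = \inf_{x\in A\cap[-M_1,M_1]}\Lambda^{k*}(x) \quad\text{for all } k\ge 0. \]

Using Assumption 2.1 we get $M_2 > 0$ such that $|\lambda| > M_2$ implies $|(\Lambda^0)'(\lambda)| > 2M_1$. Since $\Lambda^k(\cdot)$ converges locally uniformly to $\Lambda(\cdot)$ we know that there exists $K_0$ such that

\[ \sup_{\lambda\in[-M_2,M_2]}|\Lambda^k(\lambda)-\Lambda(\lambda)| < \epsilon/4 \quad\text{for all } k\ge K_0. \tag{4.3} \]

Now, for any $x\in[-M_1,M_1]$ we can get $\lambda_x\in[-M_2,M_2]$ such that $\lambda_xx - \Lambda(\lambda_x) > \Lambda^*(x) - \epsilon/4$ and therefore for all $k\ge K_0$

\[ \Lambda^{k*}(x) \ge \lambda_xx - \Lambda^k(\lambda_x) \ge \lambda_xx - \Lambda(\lambda_x) - \epsilon/4 \ge \Lambda^*(x) - \epsilon/2. \]

This implies for all $k\ge K_0$

\[ \inf_{x\in A\cap[-M_1,M_1]}\Lambda^{k*}(x) \ge \inf_{x\in A\cap[-M_1,M_1]}\Lambda^*(x) - \epsilon \]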

and that completes the proof. □

Lemma 4.3. For any measurable set $A\subset\mathbb{R}$

\[ \inf_{x\in A}\Lambda^*(x) > 0 \quad\text{implies}\quad \inf_{k\ge 0}\inf_{x\in A}\Lambda^{k*}(x) > 0. \tag{4.4} \]

Proof. Using Lemma 4.1(ii) it suffices to show that the hypothesis of (4.4) implies $\inf_{x\in A}\Lambda^{0*}(x) > 0$. Fix any $x\ne 0$. Since $\Lambda^0(\lambda)$ is strictly convex and finite everywhere and $(\Lambda^0)'(0) = 0$, we get that if $(\Lambda^0)'(\lambda^0_x) = x$ then $\lambda^0_x\ne 0$. Then $\Lambda^{0*}(x) = \lambda^0_xx - \Lambda^0(\lambda^0_x) \ne 0$. Hence, for a measurable set $A\subset\mathbb{R}$,

\[ \inf_{x\in A}\Lambda^{0*}(x) = 0 \quad\text{implies}\quad 0\in\bar A. \]

That would imply $\inf_{x\in A}\Lambda^*(x) = 0$. This proves the lemma. □